Author: Julien Dorier, Bioinformatics Competence Center, University of Lausanne, 1015 Lausanne, Switzerland
Terminology is illustrated in Figure 1:
The cell tracking graph is created by assigning one vertex to each labelled region (one vertex per time frame) and one edge between each overlapping pair of labelled region (considering only pairs of labelled region separated by not more than time frames). The overlap area is stored as an edge attribute. An example of cell tracking graph obtained with is shown in Figure 2.
Note:
is called "Max delta frame" in the GUI and max_delta_frame
in the code.
As illustrated in Figure 2, labelled region are usually obtained independently for each frame and labels (colors) do not match across time frames. To obtain consistent labels across time, we use the following iterative relabelling approach (see Figure 3):
Labelled regions (vertices) in the first time frame () are arbitrarily relabelled using consecutive integer labels , , , (Figure 3A).
We then iterate over the remaining time frames (Figure 3B-F).
At time frame (), we evaluate a "confusion matrix", containing the total overlap area between each labelled region (vertex) at time frame t and each labelled region (vertex) at previous time frames. Note that all previous time frames (, , ) have already been relabelled. Each row of the confusion matrix corresponds to a labelled region (vertex) in the current frame t, while each column corresponds to a labelled region (vertex) in previous frames (, , , ). The entry in the -th row and -th column contain sum of the overlap areas between the labelled region (with label corresponding to row ) and all labelled regions (with label corresponding to column ) in the previous time frames , , , . It is evaluated by summing the overlap areas stored as edge attribute for all edges connecting the vertex with label corresponding to row at time frame t to all vertices with label corresponding to column at previous time frames (dark grey edges in Figure 3).
One to one matching between labels in the current time frame (rows) and labels in previous time frames (columns) is found using linear sum assignment (scipy implementation, using the opposite of the confusion matrix as cost matrix). Labels in the current time frame are replaced by the matching labels in previous time frames. If the confusion matrix contains more rows than columns, some labels in current time frame do not match any labels in previous time frames. In this case, unmatched labels are replaced by new labels , , (with the maximum label found across all previous time frames).
The resulting relabelled cell tracking graph and mask are shown in Figure 4
The final cell tracking graph is obtained by (see Figure 5):
Removing edges corresponding to low overlap: Edges with an overlap area smaller than a predefined fraction (20% by default) of any the labelled region area corresponding to the source or target vertices (dashed edges in Figure 5).
Adding missing edges between vertices with same label: If two consecutive (in time) vertices with same label are not connected by an edge, then add the missing edge between these vertices (Figure 6). This could happen after removing low overlap edges.
Removing redundant edges: Intuitively, a redundant edges is an edge enclosing another edge (light grey edges in Figure 5). More precisely, an edge connecting a vertex with label at time frame to a vertex with label at time frame () is considered as redundant if there is at least one other edge connecting a vertex with label at time frame to a vertex with label at time frame and if and .
The resulting final cell tracking graph is shown in Figure 7.
Note: The threshold
used to filter out edges corresponding to low overlap is called "Min
overlap fraction" in the GUI and min_overlap_fraction in
the code.
Cell segmentation tools (e.g. cellpose) may occasionally produce errors (e.g. time frames 3 and 4 in Figure 8A). When segmentation errors are punctual and surrounded (in time) by stable and error-free time frames, it is possible to correct the error using the information contained in neighboring time frames ("interpolating" neighboring time frames).
To "interpolate" a set of labelled regions (labels 1=green and 2=red in Figure 8) over a range of time frames (t=3 and t=4 in Figure 8) using the information contained in neighbouring frames on each side ( by default), we start by evaluating the signed distance map for each labelled region and each time frame separately (Figure 8B and 8D). Each pixel in the distance map contain pixel the signed distance from this pixel to the boundary of the labelled region, with positive distance inside the labelled region and negative distances outside. For each selected labelled region and each selected time frame, we evaluate the median of all distance maps for the labelled region at time frames within of the selected time frame (Figure 8C and 8E).
For each selected time frame, new labelled regions are obtained by assigning each pixel to the label with highest positive median distance map (Figure 8F). Pixels with negative median distance for all labelled regions are assigned to the background.
The final mask is obtained by erasing (i.e setting to background label) selected labelled regions in selected time frames (Figure 8G) and pasting non-background pixels from the new labelled regions (Figure 8F) onto the final mask (Figure 8H).
Note:
is called "Max delta frame (interpolation)" in the GUI and
max_delta_frame_interpolation in the code.
The previous "mask interpolation" method can be applied to any user selected set of labelled regions and time frames. However, it may not give reasonable results if the labelled regions are not stable enough within time frames around the frames to be corrected.
Intuitively, this method should work best for punctual segmentation errors (i.e. spanning only few time frames), surrounded by enough stable and error-free time frames. To automatically find this type of segmentation defects, we use the information contained in the cell tracking graph:
Stable regions: Intuitively, a stable region of the cell tracking graph is a series of vertices in consecutive time frames (no missing time frame), all corresponding to the same mask label, each with a unique incoming and a unique outgoing edge (no branching).
More precisely, we start by flagging all edges of the cell tracking graph as stable or not stable. An edge is defined as stable if and only if it satisfies all the following criteria (Figure 9):
Examples of not stable edges are shown in Figure 9 (blue letters):
Stable regions are found as the connected components of the subgraph induced by all stable edges. The size of each stable region is measured as the number of vertices in connected component (Figure 10).
Punctual defects: Candidate defects are found as the connected components of the subgraph induced by all not stable edges (ignoring trivial components of size 1).
This initial list of candidate defects (connected components) is then filtered out to keep only punctual defects surrounded by stable regions, satisfying all the following criteria (see Figure 11):
Figure 11 and 12 show exemples of candidate defects. For and , the following candidate defects would be filtered out:
Finally, for each remaining defect in the list, the mask interpolation method described in the previous section is applied to the list of all labels appearing in the defect, for all time frames covered by the defect except first and last time frame (Figure 13).
Notes:
stable_overlap_fraction in the code.nframes_defect
in the code.nframes_stable
in the code.For the GUI: